A method for encoding and decoding image dependent watermarks

Technical Field 

The present invention relates to methods of 
generating and decoding watermarks in images according to 
the preamble of the independent claims. 

Background Art 

The present invention relates to methods of 
generating and decoding image dependent watermarks in a 
novel way which simultaneously addresses one or more 
critical problems not solved by current methods. 

The idea of using a robust digital watermark 
to detect and trace copyright violations has stimulated 
significant interest among artists and publishers in re- 
cent years. Podilchuk (Podilchuk & Zeng 1998) gives three 
important requirements for an effective watermarking 
scheme: transparency, robustness and capacity. Transpar- 
ency refers to the fact that we would like the watermark 
to be invisible. The watermark should also be robust 
against a variety of possible image processing attacks by 
pirates. These include robustness against compression 
such as JPEG, scaling and aspect ratio changes, rotation, 
cropping, row and column removal, addition of noise, fil- 
tering, cryptographic and statistical attacks, as well as 
insertion of other watermarks (Petitcolas & Anderson 
1998) and the watermark copy attack proposed by Kutter 
(Kutter, Voloshynovskiy & Herrigel 2000) in which a wa- 
termark is estimated from one image and added to another 
one. 

The third requirement is that the watermark
be able to carry a certain amount of information, i.e.
capacity. In order to attach a unique identifier to each
buyer of an image, a typical watermark should be able to
carry at least 60-100 bits of information. However, few
publications deal with 60 or more bits.

Watermarking methods can be divided into two 
broad categories: spatial domain methods such as (Bender,
Gruhl & Morimoto 1996, Pitas 1996) and transform domain
methods which have for the most part focused on DCT
(Podilchuk & Zeng 1998, Barni et al. 1998), DFT (Pereira &
Pun 1999, Barni, Bartolini, Rosa & Piva 1999) and most
recently wavelet domain methods (Podilchuk & Zeng 1998,
Barni, Bartolini, Cappellini, Lippi & Piva 1999, Zhu,
Xiong & Zhang 1999). Transform domain methods have sev- 
eral advantages over spatial domain methods. Firstly, it 
has been observed that in order for watermarks to be ro- 
bust, they must be inserted into the perceptually sig- 
nificant parts of an image. For images these are the 
lower frequencies which can be marked directly if a 
transform domain approach is adopted (Cox, Kilian,
Leighton & Shamoon 1996). Secondly, since compression 
algorithms operate in the frequency domain (for example 
DCT for JPEG and wavelet for EZW) it is possible to opti- 
mize methods against compression algorithms. Thirdly, 
certain transforms are intrinsically robust to certain 
transformations. For example, the DFT domain has been 
successfully adopted in algorithms which attempt to recover
watermarks from images which have undergone affine
transformations (Pereira & Pun 1999).

While transform domain watermarking clearly 
offers benefits, the problem is more challenging since it 
is more difficult to generate watermarks which are 
adapted to the human visual system (HVS). One possibility
which has recently appeared is the attempt at specifying
the mask in the transform domain (Podilchuk & Zeng
1998). Podilchuk and Zeng have accurately modeled the
masking in both the wavelet and discrete cosine transform 
(DCT) domains where it is shown how to obtain the allow- 
able distortion at a given coefficient as a function of 
all other coefficients.

The existing technologies exhibit at least
one of the following problems:

1. Fewer than 60 bits are encoded.

2. Sub-optimal spatial domain modulation is
applied to reduce visibility.

3. The watermark is not image dependent and
in particular does not resist the watermark copy
attack, which estimates the watermark from one image and
adds it to another.

4. An additive watermark is used, which is easily
copied or attacked by denoising and perceptual remodulation
as proposed by Voloshynovskiy (Voloshynovskiy,
Herrigel, Baumgaertner, Pereira & Pun 2000).

5. At embedding the image is treated as noise.

It is the object of the present invention to 
provide a method of embedding a watermark which simulta- 
neously is capable of dealing with the 5 stated problems. 

In one aspect, the invention consists of formulating
the problem as a constrained optimization problem,
in which the optimization takes place over the watermarking
domain with constraints on visibility posed in (possibly)
another domain. Furthermore, the image is not
treated as noise, but as a sequence of known values, which
leads to much better performance.

In another aspect, in order to render the watermark
image dependent, a coding scheme is described in
which coding one or more bits depends explicitly on the
values of one or more transform or spatial domain
coefficients. Since these coefficients vary from image to
image, copying of the watermark will not result in a
successful detection. In fact this coding scheme renders
the watermark non-additive, which is essential in resisting
the copy attack. The non-additive and highly adaptive
nature also makes the watermark extremely robust.

In yet another aspect, it is shown how to incorporate
the knowledge of JPEG quantization tables or
any other quantization tables such as MPEG, LZW or others
in order to render the watermark more resistant to
compression.

In another aspect, we indicate how to apply 
the algorithm to video watermarking and music watermark- 
ing. 

Disclosure of the Invention 

It is the object of the present invention to 
provide a method of the type mentioned above that is ca- 
pable of dealing with at least some, preferably all of 
these problems. 

According to the present invention, the prob- 
lem is solved by the method of the independent claims. 

Preferred embodiments are described in the 
dependent claims. 

The present method is suited for watermarking 
still images or video data or music signals. While the 
primary goal of watermarking is copyright protection, the 
method is also suited for other applications such as 
steganography where we are interested in embedding infor- 
mation in a medium. 

Modes for Carrying Out the Invention 
Formulation of preferred embodiments:

We formulate the embedding process as a constrained
optimization problem. We assume that we are
given an image to be watermarked, denoted X. If it is an
RGB image we work with the luminance component, though the
same methodology can be applied to other color spaces
where one or more of the color components are being
watermarked. We are also given a masking function V(X)
which returns a matrix of the same size as X containing
the values Δ_{i,j} corresponding to the amount by which
coefficient (i,j) can be changed without being noticed.
Δ can be computed in either a transform domain or the spatial
domain by noise visibility functions (NVF) as proposed
by Voloshynovskiy (Voloshynovskiy, Herrigel, Baumgaertner
& Pun 1999) or other visual models such as those proposed
by Osberger (Osberger, Bergmann & Maeder 1998) or
Podilchuk (Podilchuk & Zeng 1998).

In the general case, the function V can be a
complex function of texture, luminance, contrast,
frequency and patterns. We wish to embed a binary message
m = (m_1, m_2, ..., m_M), where M is the number of bits in
the message. In general, the binary message may first be
augmented by a checksum and/or coded using error correction
codes such as BCH or turbo codes to produce a message
m_c = (m_1, m_2, ..., m_{N_t}) of total length N_t. Without
loss of generality we assume the image X is of size 128x128,
corresponding to a very small image. For larger images the
same procedure is adopted for each 128x128 block.

To embed the message, we first divide the image
into 8x8 blocks and calculate the DCT of each block.
In each 8x8 DCT block we embed a subset of the bits of m_c.
For each bit m_i we select, based on a key, two coefficients
c_1 and c_2 in which we will embed the information bit. For
better performance each coefficient in a block should only be
used once, so that in general it is understood that for a
different m_i, c_1 and c_2 are different. We recall that
in order to ensure the watermark remains invisible, we
must insist that c_1 and c_2 do not change by more than the
allowed Δ_{i,j} for each coefficient as determined by the
function V(X).
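The block partition, block DCT and key-driven choice of coefficient pairs described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`split_blocks`, `dct_2d`, `select_pairs`) are hypothetical, the orthonormal DCT-II is one common normalization, the key is turned into positions with Python's seeded generator, and excluding the DC coefficient from selection is a choice made here, not something the text prescribes.

```python
import math
import random

N = 8  # block size used in the text

def dct_1d(v):
    """Orthonormal DCT-II of a sequence (normalization is an assumption)."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    n_rows, n_cols = len(rows), len(rows[0])
    cols = [dct_1d([rows[i][j] for i in range(n_rows)]) for j in range(n_cols)]
    return [[cols[j][k] for j in range(n_cols)] for k in range(n_rows)]

def split_blocks(image, n=N):
    """Cut an image (list of rows) into n x n blocks, row by row.
    Assumes both dimensions are multiples of n."""
    h, w = len(image), len(image[0])
    return [[row[c:c + n] for row in image[r:r + n]]
            for r in range(0, h, n) for c in range(0, w, n)]

def select_pairs(key, block_index, bits_per_block, n=N):
    """Pick disjoint coefficient positions (c_1, c_2) for each bit of a block.
    Hypothetical keying: the key and block index seed a PRNG."""
    rng = random.Random(f"{key}:{block_index}")
    # Skipping the DC term (0, 0) is an assumption of this sketch.
    positions = [(u, v) for u in range(n) for v in range(n) if (u, v) != (0, 0)]
    chosen = rng.sample(positions, 2 * bits_per_block)  # no position reused
    return [(chosen[2 * i], chosen[2 * i + 1]) for i in range(bits_per_block)]
```

Because `rng.sample` draws without replacement, each coefficient in a block is used at most once, and the decoder can regenerate the same positions from the same key.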

From this basic setup, several strategies can
be adopted to embed the message. In one embodiment, we
encode a 1 by maximizing |c_1 - c_2| while to encode a 0 we
minimize |c_1 - c_2|. The key advantage arises from the use
of the absolute value. The main idea is that to maximize
|c_1 - c_2| we will increase c_1 and decrease c_2 if c_1 > c_2;
otherwise we will increase c_2 and decrease c_1. In order to
minimize |c_1 - c_2| we will move c_1 and c_2 so that they are
as close as possible to being equal. It is also possible
to move them both towards 0 although this will typically
cost more energy. Whether we maximize or minimize, we
note that the embedding depends on the original values c_1
and c_2, which vary from image to image. This is the key
to rendering the scheme image dependent and thereby
resistant to the copy attack.
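A minimal sketch of the single-bit rule may make the role of the visibility limits concrete. The function name `embed_bit` and the parameters `d1`, `d2` (the allowed changes for the two coefficients, as returned by the masking function V) are hypothetical. When minimizing, the text only asks that the coefficients end up as close as possible; splitting the movement in proportion to the two allowances is an assumption of this sketch.

```python
def embed_bit(c1, c2, d1, d2, bit):
    """Return modified (c1, c2) encoding one bit via |c1 - c2|.

    d1, d2 are the largest invisible changes allowed for c1 and c2.
    """
    if bit == 1:
        # Maximize |c1 - c2|: push the larger one up and the smaller one down.
        if c1 > c2:
            return c1 + d1, c2 - d2
        return c1 - d1, c2 + d2
    # Minimize |c1 - c2|: move the two values toward each other.
    gap = abs(c1 - c2)
    budget = d1 + d2
    if gap == 0 or budget == 0:
        return c1, c2
    total = min(gap, budget)  # never overshoot, never exceed the allowances
    m1 = total * d1 / budget  # proportional split (assumption)
    m2 = total * d2 / budget
    sign = 1 if c1 > c2 else -1
    return c1 - sign * m1, c2 + sign * m2
```

For instance, `embed_bit(5.0, 2.0, 1.0, 1.0, 1)` widens the pair to `(6.0, 1.0)`, while `embed_bit(5.0, 2.0, 2.0, 2.0, 0)` closes the gap to `(3.5, 3.5)`. In both cases the result depends on the starting values, which is what ties the mark to the image.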

As is obvious to a person skilled in the art 
the maximization and minimization may be inverted at em- 
bedding. The compensation at decoding is straightfor- 
ward. Once the coefficients have been modified in the DCT 
domain, the inverse DCT block by block is computed to ob- 
tain the watermarked image in the spatial domain. 

In order to decode the watermark we simply
calculate |c_1 - c_2| associated with each bit and then
compare it to a threshold T. If it is larger than T we
assign 1 to the bit; otherwise we assign 0. The error
correction codes are then decoded and the checksum tested if
necessary. In a superior embodiment, rather than assigning
1 and 0 after comparison to a threshold (known as
hard decoding) we can retain the values |c_1 - c_2| - T and use
them directly in the soft decoding of error correction
codes, which may yield a gain of more than 3 dB in some
cases. Typically the threshold T is calculated empirically
by testing the algorithm and choosing a T which
yields the best performance.
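The two decoding modes can be illustrated in a few lines. The names below are hypothetical, and `pick_threshold` is only one plausible reading of "calculated empirically": it assumes a set of test pairs with known bits and returns the candidate T with the fewest hard-decoding errors. The soft values are what would be handed to a soft-input error correction decoder, which is not shown here.

```python
def magnitude(pair):
    """|c1 - c2| for one coefficient pair."""
    c1, c2 = pair
    return abs(c1 - c2)

def hard_decode(pairs, T):
    """Hard decoding: 1 if |c1 - c2| exceeds T, else 0."""
    return [1 if magnitude(p) > T else 0 for p in pairs]

def soft_values(pairs, T):
    """Soft decoding input: signed distance |c1 - c2| - T for each bit."""
    return [magnitude(p) - T for p in pairs]

def pick_threshold(labelled, candidates):
    """Choose T empirically from (pair, known_bit) samples (assumed test data)."""
    def errors(T):
        return sum(1 for pair, bit in labelled
                   if (1 if magnitude(pair) > T else 0) != bit)
    return min(candidates, key=errors)
```

A positive soft value votes for 1 and a negative one for 0, with the magnitude indicating how reliable the vote is.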

In the above simple embodiments we have used
the function |c_1 - c_2| for embedding. In other embodiments
functions of the form |f(c_1, c_2, c_3, c_4, ...)| may be of
interest. In particular, rather than using just a difference
of two coefficients, we may calculate a linear combination
of several coefficients prior to taking the absolute
value. Another possibility is to multiply coefficients.
Yet another embodiment consists of using several levels
of absolute values. In other words, we would calculate
||c_1| - |c_2|| or the absolute value of a linear combination
of absolute values in general. In all cases the principle
used for decoding remains the same. That is, we calculate
the function used at embedding and then use soft
or hard decoding of error correction codes. Preferred
embodiments would use turbo codes, which approach optimal
performance in Gaussian channels.
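Two of the generalized detection functions mentioned above can be written down directly. The function names and the default weights are illustrative assumptions; only the forms themselves (absolute value of a linear combination, and nested absolute values) come from the text.

```python
def abs_linear(coeffs, weights):
    """|k_1*c_1 + k_2*c_2 + ... + k_n*c_n| for given constants k."""
    return abs(sum(k * c for k, c in zip(weights, coeffs)))

def nested_abs(c1, c2, k1=1.0, k2=1.0):
    """|k_1*|c_1| - k_2*|c_2||, i.e. two levels of absolute values."""
    return abs(k1 * abs(c1) - k2 * abs(c2))
```

With weights `[1.0, -1.0]`, `abs_linear` reduces to the basic difference function, so `abs_linear([3.0, 5.0], [1.0, -1.0])` gives `2.0`. Decoding is unchanged: the chosen function is evaluated and compared with the threshold T.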

In the above embodiments we have chosen to
encode each bit separately by modulating a set of
coefficients as determined by a function. In a fundamentally
different embodiment we can encode several bits by the
maximization and minimization of functions of the form
|f(c_1, c_2, c_3, c_4, ...)|.

Table 1: Encoding of multiple bits

Bits    Associated function
00      |c_1 - c_2|
01      |c_3 - c_4|
10      |c_5 - c_6|
11      |c_7 - c_8|

As an example of the encoding scheme, Table 1
associates pairs of bits with a given function, which in
this embodiment is just the absolute value of the difference
between 2 coefficients. If we would like to encode
00 we would maximize |c_1 - c_2| and minimize |c_3 - c_4|,
|c_5 - c_6| and |c_7 - c_8|. Clearly in other embodiments we may
use more general functions of the form |f(c_1, c_2, c_3, c_4, ...)|
as described previously. The decoding for this embedding
strategy is straightforward. We must simply calculate
the associated functions and choose the bit pair which
corresponds to the maximum. We also note that more than
two bits can be used; however, the number of functions and
therefore the encoding and decoding complexity required
goes up exponentially. In other embodiments Reed-Solomon
codes or low density parity check (LDPC) codes are used
when bits are grouped together at embedding since these
tend to perform better in these situations.

WO 02/19269    PCT/CA01/01146
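The grouped scheme of Table 1 can be sketched as below, with eight coefficients forming four pairs, one per bit pair. The names `embed_group` and `decode_group`, the flat list layout, and the proportional split used when pulling a pair together are assumptions of this sketch. `deltas` stands for the per-coefficient visibility limits.

```python
GROUPS = ["00", "01", "10", "11"]  # row order of Table 1

def push_apart(c1, c2, d1, d2):
    """Maximize |c1 - c2| within the allowed changes."""
    return (c1 + d1, c2 - d2) if c1 > c2 else (c1 - d1, c2 + d2)

def pull_together(c1, c2, d1, d2):
    """Minimize |c1 - c2| within the allowed changes."""
    gap, budget = abs(c1 - c2), d1 + d2
    if gap == 0 or budget == 0:
        return c1, c2
    total = min(gap, budget)
    sign = 1 if c1 > c2 else -1
    return c1 - sign * total * d1 / budget, c2 + sign * total * d2 / budget

def embed_group(coeffs, deltas, bits):
    """coeffs, deltas: 8 values each; pair i is (coeffs[2i], coeffs[2i+1])."""
    out = []
    for i, group in enumerate(GROUPS):
        # Only the function assigned to the wanted bit pair is maximized.
        move = push_apart if group == bits else pull_together
        out.extend(move(coeffs[2 * i], coeffs[2 * i + 1],
                        deltas[2 * i], deltas[2 * i + 1]))
    return out

def decode_group(coeffs):
    """Return the bit pair whose associated function is largest."""
    scores = [abs(coeffs[2 * i] - coeffs[2 * i + 1]) for i in range(len(GROUPS))]
    return GROUPS[scores.index(max(scores))]
```

Grouping k bits this way needs 2^k functions per group, which is the exponential growth in complexity noted in the text.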

In what has been described, the DCT domain 
has been used, however any other domain may also be used 
for the embedding. In the case of spatial domain embed- 
ding, typically the local mean would first be removed. 
Although this is not necessary, if we do this, we obtain 
coefficients whose expected value is 0. This is the case 
for most transform domains since the mean is represented 
by one coefficient in a given block. In the case of the 
wavelet domain, the coefficients representing the local
means are contained in the lowest subband. Using zero
mean coefficients considerably simplifies the embedding
since the mean need no longer be accounted for at
decoding.
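For the spatial domain case, removing the local mean might look like the following. Taking "local mean" to be the mean of each block is an assumption (a sliding-window mean would be another reading), and `remove_local_mean` is a hypothetical helper name.

```python
def remove_local_mean(block):
    """Subtract the block mean so the remaining values have zero mean.

    Returns the zero-mean block and the mean that was removed.
    """
    flat = [p for row in block for p in row]
    mean = sum(flat) / len(flat)
    return [[p - mean for p in row] for row in block], mean
```

The zero-mean values then play the role of the coefficients c_1, c_2 in the embedding and decoding functions, and the stored mean is added back after embedding to recover pixel values.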

We note that it is now well known that a 
synchronization pattern can be added to the watermark. 
At decoding the synchronizing pattern is searched for. 
If the image has undergone geometrical transformations, 
these are compensated for and then the watermark is de- 
coded. An example of a synchronization pattern commonly 
used is a set of peaks in the Fourier transform domain as 
done in (Pereira & Pun 1999). Consequently in all the 
embodiments, it is to be understood that synchronization
patterns can be added with little or no effect on the
watermark itself since the energy used in the pattern is
typically much less than the energy of the watermark.

While there are shown and described presently 
preferred embodiments of the invention, it is to be dis- 
tinctly understood that the invention is not limited 
thereto but may be otherwise variously embodied and
practiced within the scope of the following claims.

Claims

1. A method for embedding a watermark in a 
still or non-still image or music signal or other digital 
signal I comprising the steps of: 

selecting coefficients c_i from a set c = {c_1, ..., c_N}
of coefficients, wherein said set of coefficients
corresponds to said signal I or a transform thereof, and
wherein said coefficients c_i are selected to be used for
encoding a message m_c or a message derived from it by use
of error correction codes, and modifying each of said
selected coefficients c_i to encode one or more bits of said
message by minimizing or maximizing functions of the
form |f(c_1, c_2, c_3, c_4, ...)|.

2. The method of claim 1 wherein said function
f is of the form |k_1*c_1 + k_2*c_2 + k_3*c_3 + ... + k_n*c_n| where
the values k are constants and at least 2 coefficients k_1
and k_2 are non zero.

3. The method of claim 2 where k_1 is 1 and k_2
is -1.

4. The method of claim 1 where the function
is |k_1*|c_1| - k_2*|c_2|| where the values k are constants.

5. The method of one of the preceding claims 
applied to the spatial domain, where the local mean has 
first been removed. 

6. The method of one of the preceding claims
wherein said coefficients c_i are constrained by values
Δ_{i,j} as determined by a masking function of the data set
V(X).

7. The method of claim 6 where bits are first 
grouped together and each possible bit group is assigned 
a function and the function corresponding to the bit 
group to be embedded is maximized while the others are 
minimized.

8. The method of claim 7 where Reed-Solomon
codes or LDPC codes are used to encode the original
message prior to embedding.

9. A method for decoding the watermark generated
by the method of one of the preceding claims (except
claims 7 and 8) comprising the step of calculating the
function |f(c_1, c_2, c_3, c_4, ...)| and comparing it to a
threshold value T, and assigning a value of 1 if greater than T
and 0 otherwise and then decoding error correction codes
if necessary.

10. The method of claim 9 where soft decoding
is used based on the difference |f(c_1, c_2, c_3, c_4, ...)| - T.

11. A method for decoding a watermark gener- 
ated by the method of claims 7 or 8 consisting of calcu- 
lating all the functions used at embedding and choosing 
the bit sequence associated with the function yielding 
the maximum value, and decoding the error correction 
codes if necessary after the entire bit sequence has been 
retrieved. 

12. A method for watermarking video data com- 
prising a plurality of consecutive video frames, wherein 
the method of one of the claims 1-11 is applied to at 
least some of said video frames. 

13. A method for watermarking a 1D music signal
wherein the method of one of the claims 1-11 is
applied to at least some of said music signal.